Abstract. Kloks, Kratsch, and Spinrad showed how treewidth and minimum-fill,
NP-hard combinatorial optimization problems related to minimal triangulations,
are broken into subproblems by block subgraphs defined by minimal separators.
These ideas were expanded on by Bouchitté and Todinca, who used potential
maximal cliques to solve these problems using a dynamic programming approach in
time polynomial in the number of minimal separators of a graph. It is known
that solutions to the perfect phylogeny problem, maximum compatibility problem,
and unique perfect phylogeny problem are characterized by minimal
triangulations of the partition intersection graph. In this paper, we show that
techniques similar to those proposed by Bouchitté and Todinca can be used to
solve the perfect phylogeny problem with missing data, the two-state maximum
compatibility problem with missing data, and the unique perfect phylogeny
problem with missing data in time polynomial in the number of minimal
separators of the partition intersection graph.

1 Introduction 

The perfect phylogeny problem, also called the character compatibility problem,
is a classic NP-hard [5,26] problem in phylogenetics [11,25]. Characters that
have a perfect phylogeny are called homoplasy-free, i.e. they map to a tree with
no horizontal evolutionary events such as recombination or gene transfer. For a
collection of partially labeled (a.k.a. missing data) unrooted trees, one can
construct characters that have a perfect phylogeny precisely when the collection
has a compatible supertree [25]. The more general problem of supertree
estimation is of wide interest.

Solutions to the perfect phylogeny problem are characterized by the existence
of restricted (minimal) triangulations of the partition intersection graph
[10,21,26], and minimal triangulations of the partition intersection graph also
play an important role in two variants of this problem. The first, the maximum
compatibility problem, asks for the largest subset of a set of given characters
that has a perfect phylogeny [7,15], and the second asks whether a set of
characters has a unique perfect phylogeny^1 [13,24]. Interestingly, the unique
perfect phylogeny problem is NP-hard even when a perfect phylogeny for the
characters is given [6,16]. Despite considerable advances in the field of
minimal triangulations, to our knowledge these results have not been extended
to the aforementioned problems, although the use of such methods to solve at
least the perfect phylogeny problem may have been alluded to (see p. 2 of [H]).

1 When a set of characters C has a unique perfect phylogeny, it is also common
in the literature to say that C defines an X-tree.

Bouchitté and Todinca [9] used potential maximal cliques to create the first
algorithm that solves minimum-fill and treewidth in time polynomial in
$|\Delta_G|$, the number of minimal separators of the graph, and this algorithm
was improved upon in [12]. In this paper, we show how to extend the potential
maximal clique approach to solve the perfect phylogeny problem, the maximum
compatibility problem, and the unique perfect phylogeny problem. This approach
is motivated by two observations: first, the algorithms in [8,12] run in time
polynomial in the number of minimal separators of the graph, and second, data
generated by the coalescent-based program ms [18] often results in a partition
intersection graph with a reasonable number of minimal separators [14], despite
there being an exponential number of minimal separators in general. In order to
unify our approach, we use a weighted variant of the well-studied minimum-fill
problem, which is NP-hard [27] and is an active area of research [4,12].

Given full characters (a.k.a. complete data), the perfect phylogeny problem is
solvable in polynomial time when the number of characters is fixed [20] or when
the number of parts is bounded [1]. Our results apply to the most general
setting, where the characters may be partial (a.k.a. missing data), and each
character has unbounded parts (a.k.a. unbounded maxstates). See [17] for a
survey on minimal triangulations, [11,25] for further reading on the perfect
phylogeny / character compatibility problem, and [13] for further reading on
unique perfect phylogeny.

2 Definitions and results 

An X-tree is a pair $\mathcal{T} = (T, \phi)$, where $T$ is an undirected tree,
and $\phi$ is a mapping from $X$ to the nodes of $T$ such that every node of $T$
with degree two or one is mapped to by $\phi$. A character on $X$ is a partition
$\chi = A_1|A_2|\cdots|A_r$ of a subset of $X$. For $i = 1, 2, \ldots, r$ the set
$A_i$ is a cell of $\chi$. Given a cell $A$ of a character, the minimal subtree
of $T$ that connects $\phi(A)$ is denoted $T(A)$. An X-tree $\mathcal{T}$
displays a character $\chi$ if, for each pair of distinct cells $A$ and $A'$ of
$\chi$, the trees $T(A)$ and $T(A')$ have no nodes in common. Given a set C of
characters, the perfect phylogeny problem is to determine if there is an X-tree
$\mathcal{T}$ that displays every character in C. In this case, we call
$\mathcal{T}$ a perfect phylogeny for C, and say that C is compatible.

The perfect phylogeny problem reduces to a graph theoretic problem that we 
detail now. A graph is chordal if any cycle it has on four or more vertices has a 
chord, that is, an edge between two non-consecutive vertices in the cycle. When 
G is not chordal, we may add edges to G to obtain a chordal supergraph H that 
is called a triangulation of G. The edges added to G to obtain H are called fill 
edges of H. When no proper subset of H's fill edges can be added to G to obtain
a triangulation, we call H a minimal triangulation of G. 
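
The definitions above can be checked mechanically on small graphs. The following is a minimal sketch, not part of the algorithms of this paper. It assumes a graph is stored as a dictionary from each vertex to its set of neighbours, and the function names are hypothetical. It relies on two standard facts: a graph is chordal exactly when simplicial vertices can be removed one at a time until no vertex remains, and a triangulation is minimal exactly when removing any single fill edge destroys chordality.

```python
from itertools import combinations


def is_chordal(adj):
    """Return True if the graph (vertex -> set of neighbours) is chordal."""
    g = {v: set(ns) for v, ns in adj.items()}  # work on a copy
    while g:
        for v, ns in g.items():
            # v is simplicial when its remaining neighbours are pairwise adjacent
            if all(b in g[a] for a, b in combinations(ns, 2)):
                break
        else:
            return False  # no simplicial vertex is left, so the graph is not chordal
        for u in g[v]:
            g[u].discard(v)
        del g[v]
    return True


def add_fill(adj, fill_edges):
    """Return a copy of adj with the given fill edges added."""
    h = {v: set(ns) for v, ns in adj.items()}
    for u, v in fill_edges:
        h[u].add(v)
        h[v].add(u)
    return h


def is_minimal_triangulation(adj, fill_edges):
    """True if the fill edges triangulate adj and no single one can be dropped."""
    fill = list(fill_edges)
    if not is_chordal(add_fill(adj, fill)):
        return False
    return all(
        not is_chordal(add_fill(adj, fill[:i] + fill[i + 1:]))
        for i in range(len(fill))
    )


# A four-cycle needs exactly one chord to become chordal.
c4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
print(is_chordal(c4), is_minimal_triangulation(c4, [(0, 2)]))
```

The elimination loop is quadratic or worse in the number of vertices; it is meant for illustrating the definitions, not for the running times discussed later.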
Fig. 1. An X-tree T displaying C = {abcdef|gh|ij|kl, ag|dj|fl, bh|ci|ek} and
the corresponding partition intersection graph int(C). We use A to denote
abcdef, and have labeled the vertices by their cells. The character
abcdef|gh|ij|kl distinguishes the edges of T marked with dashes. Removing these
edges results in the four subtrees defined by T(abcdef), T(gh), T(ij), and
T(kl). The dashed edges of int(C) define a proper triangulation, and the solid
edges are obtained by cell intersection. If we replace ag|dj|fl and bh|ci|ek
with the characters ag|bh, ci|dj, and ek|fl, we would obtain a partition
intersection graph isomorphic to int(C) but with a different coloring. In that
case, there is no proper triangulation because each four cycle has only two
colors, and the fill edge ag,bh is monochromatic. Note that T does not display
ag|bh, ci|dj, or ek|fl.



Given a set of characters C, the partition intersection graph int(C) is the
graph with vertex set $\{(A, \chi) : \chi \in C$ and $A$ is a cell of $\chi\}$,
and two vertices $(A, \chi)$ and $(A', \chi')$ are adjacent in int(C) if and
only if $A$ and $A'$ have non-empty intersection. If $A_1$ and $A_2$ are cells
of a character $\chi$, then $A_1$ and $A_2$ are disjoint because $\chi$ is a
partition of a subset of $X$, so $(A_1, \chi)$ and $(A_2, \chi)$ are not
adjacent in int(C). The vertex $(A, \chi)$ has cell $A$ and character $\chi$. A
triangulation of int(C) is proper if, for each fill edge, the vertices involved
in the fill edge have different characters. This may be viewed as coloring each
vertex $(A, \chi)$ of int(C) by its character $\chi$, resulting in a properly
colored graph; proper triangulations are then those whose fill edges preserve
the proper coloring. If $u$ and $v$ are vertices of int(C) that have the same
character/color, we say that $u$ and $v$ are monochromatic. If a triangulation
of int(C) has $uv$ as an edge for monochromatic $u$ and $v$, we say that $uv$
is a monochromatic fill edge of the triangulation. See Figure 1 for an example
of these concepts.

For the remainder of this section, we characterize solutions to perfect
phylogeny problems as constrained minimal triangulations of the partition
intersection graph, and state our algorithmic results. These problems will then
be discussed in terms of minimum-weight minimal triangulations in Section 3,
and we prove our computational results in Section 4, all of which rely on
Algorithm 1. The connection between triangulations and perfect phylogeny stems
from the following result.



Theorem 1. [10,21,26] Let C be a set of characters on X. Then C is compatible
if and only if int(C) has a proper minimal triangulation.

While Theorem 1 was not originally stated in terms of minimal triangulations,
it follows from the definitions that there is a proper triangulation if and
only if there is a proper minimal triangulation. The set of minimal separators
of int(C) is denoted $\Delta_{\mathrm{int}(C)}$ (the definition of a minimal
separator appears in Section 3). Our first algorithmic result is the following
theorem.

Theorem 2. Let C be a set of characters on X with at most r parts per
character. There is an $O(|X||C|^2 + (r|C|)^4 |\Delta_{\mathrm{int}(C)}|^2)$
time algorithm that solves the perfect phylogeny problem.

If C is not compatible, then the maximum compatibility problem is to determine
the largest subset C* of C that is compatible; such a C* is called an optimal
solution. In order to characterize solutions to the maximum compatibility
problem in terms of minimal triangulations, we must consider non-proper
triangulations of the partition intersection graph. We say $\chi$ is broken by
a fill edge $(A, \chi)(A', \chi)$ because $\chi$ is the shared character of
both vertices. Given a triangulation H of int(C), the displayed characters of H
are the characters of C that are not broken by any fill edge.

Theorem 3. [7,15] Let C be a set of characters on X. Then C* is an optimal
solution to the maximum compatibility problem if and only if there is a minimal
triangulation H* of int(C) that has C* as its displayed characters, and for
every other minimal triangulation H' of int(C) with displayed characters C',
$|C'| \le |C^*|$.

Given a set of characters C, a character weight is a function w from C to the
positive real numbers (i.e. excluding zero). For a subset C' of C, define
$w(C') = \sum_{\chi \in C'} w(\chi)$. The w-maximum compatibility problem is to
find a subset C* of C such that $w(C^*) = \max w(C')$, where the maximum is
taken over all compatible subsets C' of C. We generalize Theorem 3 below, and
reserve its proof for Section 3.

Theorem 4. Let C be a set of characters on X with character weight w. Then
C* is an optimal solution to the w-maximum compatibility problem if and only
if there is a minimal triangulation H* of int(C) that has C* as its displayed
characters, and for any other minimal triangulation H' of int(C) with displayed
characters C', $w(C') \le w(C^*)$.

Our second algorithmic result is for two-state characters only. Such characters 
are interesting because they are related to finding compatible supertrees. In 
that context, an optimal solution to maximum compatibility corresponds to a 
supertree that agrees with the most edges from the partially labeled trees given 
as input. 

Theorem 5. Let C be a set of (w-weighted) two-state characters on X, i.e.,
each $\chi \in C$ has two cells. There is an
$O(|X||C|^2 + |C|^4 |\Delta_{\mathrm{int}(C)}|^2)$ time algorithm that solves
the (w-)maximum compatibility problem.



The unique perfect phylogeny problem is to determine if a perfect phylogeny
for a set of characters is the only perfect phylogeny for those characters. An
edge uv of an X-tree $\mathcal{T}$ is distinguished by a character $\chi$ if
contracting uv results in an X-tree that does not display $\chi$, and
$\mathcal{T}$ is distinguished by C if each edge of $\mathcal{T}$ is
distinguished by a character of C. An X-tree $\mathcal{T} = (T, \phi)$ is
ternary if every internal node of $T$ has degree three. Semple and Steel
characterized the existence of a unique perfect phylogeny as follows.

Theorem 6. [24] Let C be a set of characters on X. Then C has a unique perfect
phylogeny $\mathcal{T} = (T, \phi)$ if and only if the following conditions hold:

1. there is a ternary perfect phylogeny $\mathcal{T} = (T, \phi)$ for C and
   $\mathcal{T}$ is distinguished by C;

2. int(C) has a unique proper minimal triangulation.

It is well known how to create a perfect phylogeny $\mathcal{T} = (T, \phi)$
for C from a clique tree of a proper minimal triangulation in polynomial time
(e.g. see the proof of Lemma 5.1 in [7]), and a clique tree of a chordal graph
can be computed in linear time [3]. Checking if $\mathcal{T}$ is ternary and
distinguished by C is also easy to do: an edge uv is distinguished by $\chi$ if
and only if u is a node of T(A) and v is a node of T(A') for distinct cells A,
A' of $\chi$. So if it is known that int(C) has a unique proper minimal
triangulation, it is possible to determine if C has a unique perfect phylogeny
in polynomial time. On the other hand, it has recently been shown [6,16] that
if a perfect phylogeny is given for a set of characters, it is still NP-hard to
determine if it is the unique perfect phylogeny for those characters^2. That
is, determining if int(C) has a unique proper minimal triangulation is NP-hard
[16]. This makes our last algorithmic result of interest.

Theorem 7. Let C be a set of characters on X with at most r parts per
character. There is an $O(|X||C|^2 + (r|C|)^4 |\Delta_{\mathrm{int}(C)}|^2)$
time algorithm that determines if int(C) has a unique proper minimal
triangulation, i.e. it solves the unique perfect phylogeny problem.

3 Characterizations via weighted minimum-fill 

In this section, we characterize solutions to the perfect phylogeny problem and
maximum compatibility problem via a weighted variant of the minimum-fill
problem, which asks for the fewest number of edges required to triangulate a
graph. A similar characterization will be given for solutions to the unique
perfect phylogeny problem, which has an additional requirement on the minimal
separators involved in zero-weight minimal triangulations. In order for our
results to be useful in the next section, each result will be given with
respect to minimal triangulations.



2 These papers show that this problem is NP-hard even when the characters are
quartet trees, which in our setting correspond to characters of the form ab|cd.



Suppose G is a non-complete graph. If U is a subset of G's vertices, then
the potential fill edges pf(U) of U are pairs of vertices of U that are not
edges of G. A fill weight on G = (V, E) is a function $F_w$ from pf(V) to the
non-negative real numbers, i.e., including zero. For a triangulation H of G
with fill weight $F_w$, the weight of H is $F_w(H) = \sum F_w(f)$ where the sum
occurs over all fill edges $f$ of H. We will call H a $F_w$-minimum
triangulation of G if, for every other triangulation H' of G,
$F_w(H) \le F_w(H')$. In this case we write $\mathrm{mfi}_{F_w}(G) = F_w(H)$.
If $F_w(H) = 0$, then H is a $F_w$-zero triangulation of G. If H is a
$F_w$-minimum or $F_w$-zero triangulation that is also a minimal triangulation
of G, then H is a $F_w$-minimum minimal triangulation or $F_w$-zero minimal
triangulation, respectively. Note that if a $F_w$-zero triangulation exists, it
must be a $F_w$-minimum triangulation. Additionally, because $F_w$ is
non-negative, there is always a minimal triangulation that is a $F_w$-minimum
triangulation.

Definition 1. Let C be a set of characters on X. Then $I_C$ is the fill weight
of int(C) defined by

$$I_C(uv) = \begin{cases} 1 & \text{if $u$ and $v$ are monochromatic;} \\ 0 & \text{otherwise.} \end{cases}$$

Observation 1. Let C be a set of characters on X. Then a triangulation H of
int(C) is proper if and only if $I_C(H) = 0$.

Lemma 1. A collection C of characters on X is compatible if and only if int(C)
has an $I_C$-zero minimal triangulation.

Proof. The lemma follows from Theorem 1 and Observation 1. □

The following two lemmas, which follow from results in [7,15], will be helpful
for proving Theorem 4.

Lemma 2. Suppose C is a set of characters and $C' \subseteq C$ is compatible.
Then there is a minimal triangulation of int(C) such that C' is a subset of its
displayed characters.

Lemma 3. Suppose C is a set of characters and H is a triangulation of int(C).
Then the displayed characters of H are a compatible subset of C.

Proof of Theorem 4. Let C* be an optimal solution to the w-maximum
compatibility problem. By Lemma 2 there is a minimal triangulation H* of
int(C) that has at least C* as its displayed characters. Displayed character
sets are compatible by Lemma 3, so by positivity of w and optimality of C*, the
displayed characters of H* are exactly C*. If H' is another minimal
triangulation of int(C) with displayed character set C(H'), then C(H') is
compatible by Lemma 3, so $w(C(H')) \le w(C^*)$ by optimality of C*.

For the converse, let H be a minimal triangulation of int(C) with displayed
characters C(H), and suppose w(C(H)) is at least the weight of the displayed
characters of any other minimal triangulation of int(C). Then
$w(C^*) \le w(C(H))$ because C* are the displayed characters of H*. By Lemma 3
the set C(H) is compatible, so $w(C^*) = w(C(H))$ by optimality of C*.
Therefore C(H) is an optimal solution. □



Definition 2. Let C be a set of characters on X that are weighted by w. Then
the fill weight $F_w$ of int(C) induced by w is

$$F_w(uv) = \begin{cases} w(\chi) & \text{if $u$ and $v$ are monochromatic and colored by $\chi$;} \\ 0 & \text{otherwise.} \end{cases}$$

Lemma 4. Let C be a collection of two-state characters weighted by w, and
suppose H is a triangulation of int(C) with displayed characters C(H). Then
$w(C) = F_w(H) + w(C(H))$.

Proof. For each $\chi$ in C there is exactly one potential fill edge uv of
int(C) such that u and v are monochromatic with shared character $\chi$,
because $\chi$ has two states. In particular, if $\chi = A|A'$ then
$u = (A, \chi)$ and $v = (A', \chi)$. Hence there is a one-to-one
correspondence between characters in C and potential fill edges stemming from
monochromatic pairs of vertices of int(C). Further, each monochromatic pair of
vertices is either a fill edge of H, or it corresponds to a displayed character
of H. Any other potential fill edge u'v' of int(C) that does not arise in this
way is not monochromatic, and in this case $F_w(u'v') = 0$. Letting
$\mathcal{X}$ be the set of monochromatic potential fill edges of int(C), we
have

$$w(C) = \sum_{f \in \mathcal{X}} F_w(f) = \sum_{f \in \mathcal{X} \cap E(H)} F_w(f) + \sum_{f \in \mathcal{X} - E(H)} F_w(f) = F_w(H) + w(C(H)).$$

□
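
The identity of Lemma 4 can be checked numerically on a small instance. The sketch below is only an illustration under our own assumptions: vertices of int(C) are represented as pairs (cell, character index), the weights are stored in a list indexed by character, and the helper names are hypothetical. The example uses the two quartet characters ab|cd and ac|bd, whose partition intersection graph is a four-cycle that is triangulated by the single monochromatic fill edge between ab and cd.

```python
def fill_weight(u, v, weights):
    """F_w(uv): the weight of the shared character if u, v are monochromatic, else 0."""
    return weights[u[1]] if u[1] == v[1] else 0


def total_fill_weight(fill_edges, weights):
    """F_w(H) for a triangulation H given by its fill edges."""
    return sum(fill_weight(u, v, weights) for u, v in fill_edges)


def displayed_characters(num_chars, fill_edges):
    """Indices of the characters not broken by any fill edge."""
    broken = {u[1] for u, v in fill_edges if u[1] == v[1]}
    return [c for c in range(num_chars) if c not in broken]


# Characters 0: ab|cd and 1: ac|bd, with weights w(0) = 2 and w(1) = 3.
weights = [2, 3]
ab, cd = (frozenset("ab"), 0), (frozenset("cd"), 0)
fill_edges = [(ab, cd)]  # breaks character 0, keeps character 1

lhs = sum(weights)
rhs = total_fill_weight(fill_edges, weights) + sum(
    weights[c] for c in displayed_characters(len(weights), fill_edges)
)
print(lhs, rhs)
```

Both sides evaluate to the total character weight, as the lemma predicts for two-state characters; with more than two cells per character a single character can contribute several monochromatic fill edges and the identity no longer holds in this form.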

Theorem 8. Let C be a collection of two-state characters weighted by w. Then
C* is a w-maximum compatible subset of C if and only if there is a
$F_w$-minimum minimal triangulation H* of int(C) that has C* as its displayed
characters.

Proof. Suppose that C* is a w-maximum compatible subset of C. By Theorem 4
there is a minimal triangulation H* of int(C) that has C* as its displayed
characters. For the sake of contradiction suppose H* is not a $F_w$-minimum
minimal triangulation, so there is a triangulation H of int(C) such that
$F_w(H) < F_w(H^*)$. Letting C(H) be the displayed characters of H, by Lemma 4
we have $w(C) - w(C(H)) < w(C) - w(C^*)$ and therefore $w(C^*) < w(C(H))$.
Since C(H) is compatible by Lemma 3, this contradicts the optimality of C*, so
H* must be a $F_w$-minimum minimal triangulation.

Now let H' be a $F_w$-minimum minimal triangulation of int(C) with displayed
characters C(H'). Then $F_w(H') = F_w(H^*)$ by $F_w$-minimization, and
$w(C) - w(C(H')) = w(C) - w(C^*)$ by Lemma 4, so $w(C(H')) = w(C^*)$. The set
C(H') is compatible by Lemma 3, so C(H') is an optimal solution. □



The weighted maximum compatibility problem can be used to solve the maximum
compatibility problem by using the character weight where each $\chi \in C$
has weight one. This character weighting induces the fill weight $I_C$, giving
the following corollary.

Corollary 1. Let C be a collection of two-state characters. Then C* is a
maximum compatible subset of C if and only if there is a $I_C$-minimum minimal
triangulation H* of int(C) that has C* as its displayed characters.

We conclude this section by characterizing solutions to unique perfect
phylogeny. Let G = (V, E) be an undirected graph and $S \subseteq V$. We will
use G − S to denote the graph obtained from G by removing the vertices S and
edges that are incident to a vertex in S. If x, y are connected vertices in G
but disconnected in G − S, then S is an xy-separator. When no proper subset of
S is also an xy-separator, then S is a minimal xy-separator^3. If there is at
least one pair of vertices x and y such that S is a minimal xy-separator, then
it is a minimal separator of G. The set of minimal separators of G is denoted
by $\Delta_G$. Suppose $\Phi$ is a subset of G's minimal separators. The graph
$G_\Phi$ is obtained from G by adding the fill edge uv whenever
$uv \in \mathrm{pf}(S)$ for some S in $\Phi$, and we say $G_\Phi$ is obtained
by saturating each minimal separator in $\Phi$. The following fundamental
result characterizes the minimal triangulations of a graph in terms of its
minimal separators.
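
For small graphs, the notions of minimal separator and saturation can be made concrete as follows. This is a sketch under our own assumptions, not the enumeration procedure used later: graphs are dictionaries from vertices to neighbour sets, the function names are hypothetical, and the minimality test uses the standard characterization that S is a minimal separator exactly when G − S has at least two connected components whose neighbourhood is all of S.

```python
from itertools import combinations


def components(adj, removed):
    """Connected components of G - removed, as a list of vertex sets."""
    seen, comps = set(removed), []
    for start in adj:
        if start in seen:
            continue
        comp, stack = {start}, [start]
        seen.add(start)
        while stack:
            x = stack.pop()
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    comp.add(y)
                    stack.append(y)
        comps.append(comp)
    return comps


def is_minimal_separator(adj, S):
    """True if at least two components of G - S see every vertex of S."""
    S = set(S)
    full = 0
    for comp in components(adj, S):
        nbrs = set()
        for v in comp:
            nbrs |= adj[v]
        if nbrs - comp == S:
            full += 1
    return full >= 2


def saturate(adj, separators):
    """Return G_Phi: a copy of G in which each given separator is a clique."""
    g = {v: set(ns) for v, ns in adj.items()}
    for S in separators:
        for u, v in combinations(S, 2):
            g[u].add(v)
            g[v].add(u)
    return g


path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
c4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
print(is_minimal_separator(path, {1}), is_minimal_separator(path, {1, 2}))
print(saturate(c4, [{1, 3}])[1])
```

On the four-cycle, saturating the single minimal separator {1, 3} adds the chord between 1 and 3, which is one of its two minimal triangulations.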

Theorem 9. [22,23] (see also [19]) Let G be a graph and $\Delta_G$ its minimal
separators. If H is a minimal triangulation of G, then $\Delta_H$ is a maximal
pairwise-parallel set of minimal separators of G and $H = G_{\Delta_H}$.
Conversely, if $\Phi$ is any maximal pairwise-parallel set of minimal
separators of G, then $G_\Phi$ is a minimal triangulation of G and
$\Delta_{G_\Phi} = \Phi$.

An important observation from this theorem is that if H is a minimal
triangulation of G, then $\Delta_H \subseteq \Delta_G$. Let C be a set of
characters and $F_w$ be a fill weight on int(C). We will use
$\Delta^{\min}_{F_w}$ to denote the set of minimal separators S of int(C) such
that there is a $F_w$-minimum minimal triangulation H of int(C) with
$S \in \Delta_H$.

Theorem 10. Suppose C is a collection of characters on X. Then int(C) has a
unique proper minimal triangulation if and only if

1. int(C) has an $I_C$-zero minimal triangulation; and

2. $\Delta^{\min}_{I_C}$ is a maximal set of pairwise-parallel minimal
   separators of int(C).

Proof. Suppose int(C) has a unique proper minimal triangulation H*. By
Observation 1 it is an $I_C$-zero minimal triangulation of int(C), and each
minimal separator of H* is a minimal separator of int(C) by Theorem 9, so
$\Delta_{H^*} \subseteq \Delta^{\min}_{I_C}$. Alternatively, if
$S \in \Delta^{\min}_{I_C}$, then S is a minimal separator of an $I_C$-zero
minimal triangulation of int(C). This minimal triangulation is proper by
Observation 1, so $S \in \Delta_{H^*}$ by uniqueness. Therefore
$\Delta^{\min}_{I_C} = \Delta_{H^*}$, and $\Delta^{\min}_{I_C}$ is a maximal
pairwise-parallel set of minimal separators of int(C) by Theorem 9.

To prove the converse, suppose that int(C) has an $I_C$-zero minimal
triangulation, and $\Delta^{\min}_{I_C}$ is a maximal set of pairwise-parallel
minimal separators of int(C). By Theorem 9 the graph H obtained from int(C) by
saturating each minimal separator in $\Delta^{\min}_{I_C}$ is a minimal
triangulation of int(C), and further, for each fill edge uv of H, there is an
$S \in \Delta^{\min}_{I_C}$ such that $u, v \in S$. By definition there is some
$I_C$-zero minimal triangulation that has S as a minimal separator. This
triangulation has uv as a fill edge by Theorem 9, so $I_C(uv) = 0$. Therefore H
is an $I_C$-zero minimal triangulation of int(C), and by Observation 1, H is a
proper minimal triangulation of int(C).

Now let H' be any proper minimal triangulation of int(C). By Observation 1,
H' is an $I_C$-zero minimal triangulation of int(C), so
$\Delta_{H'} \subseteq \Delta^{\min}_{I_C}$. We assumed $\Delta^{\min}_{I_C}$
is pairwise-parallel, and $\Delta_{H'}$ is maximal with respect to being
pairwise-parallel by Theorem 9, so $\Delta_{H'} = \Delta^{\min}_{I_C}$. Thus
both H and H' are obtained from int(C) by saturating each minimal separator of
$\Delta_{H'} = \Delta^{\min}_{I_C}$, so H' = H. Therefore H is the unique
proper minimal triangulation of int(C). □

3 Note that a minimal xy-separator S is defined with respect to x and y. That
is, it may be that there is a different pair of vertices u, v of G such that S
is a non-minimal uv-separator.

4 Finding weighted minimum triangulations 

In this section we show that, given a fill weight $F_w$ for G, both
$\mathrm{mfi}_{F_w}(G)$ and $\Delta^{\min}_{F_w}$ can be computed in
$O(|X||C|^2 + (r|C|)^4 |\Delta_{\mathrm{int}(C)}|^2)$ time. After that, we
present proofs of our algorithmic results.

Given a graph G and $X \subseteq V$, a set $C \subseteq V - X$ is a connected
component of G − X if it is connected in G − X and it is maximal with respect
to this property. A block of a graph G is a pair (S, C) where
$S \in \Delta_G$ and C is a connected component of G − S, and it is full, or
full with respect to S, if every vertex of S has at least one neighboring
vertex that is in C (we write N(C) = S). The realization of a block (S, C) is
the graph R(S, C) with vertex set $S \cup C$, and for any u and v in
$S \cup C$, uv is an edge of R(S, C) if either uv is an edge of G or
$uv \in \mathrm{pf}(S)$.
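
To make blocks and realizations concrete, the following sketch computes the components of G − S, tests each for fullness, and builds R(S, C). It is an illustration under our own assumptions (dictionary-of-sets graphs, hypothetical function names) and does not check that S is in fact a minimal separator.

```python
def components(adj, removed):
    """Connected components of G - removed, as a list of vertex sets."""
    seen, comps = set(removed), []
    for start in adj:
        if start in seen:
            continue
        comp, stack = {start}, [start]
        seen.add(start)
        while stack:
            x = stack.pop()
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    comp.add(y)
                    stack.append(y)
        comps.append(comp)
    return comps


def neighborhood(adj, comp):
    """N(C): vertices outside C that are adjacent to some vertex of C."""
    out = set()
    for v in comp:
        out |= adj[v]
    return out - comp


def blocks(adj, S):
    """List of (C, is_full) for every connected component C of G - S."""
    S = set(S)
    return [(C, neighborhood(adj, C) == S) for C in components(adj, S)]


def realization(adj, S, C):
    """R(S, C): G restricted to S ∪ C, with S completed into a clique."""
    S = set(S)
    verts = S | set(C)
    r = {v: adj[v] & verts for v in verts}
    for u in S:
        r[u] |= S - {u}
    return r


# A four-cycle 0-1-2-3 with a pendant vertex 4 attached to vertex 1.
g = {0: {1, 3}, 1: {0, 2, 4}, 2: {1, 3}, 3: {0, 2}, 4: {1}}
print(blocks(g, {1, 3}))
print(realization(g, {1, 3}, {0}))
```

In this example the components {0} and {2} give full blocks of the separator {1, 3}, while {4} has neighbourhood {1} only and is therefore a non-full block.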

Kloks, Kratsch, and Spinrad [19] showed that the minimal triangulations of
G that have $S \in \Delta_G$ as a minimal separator (i.e. S is saturated to
obtain the minimal triangulation) can be obtained by independently minimally
triangulating R(S, C) for each connected component C of G − S. They used this
fact to relate minimum fill to the realizations of the blocks of a minimal
separator, an important first consideration for computing minimum fill using
potential maximal cliques and minimal separators. We extend this fact to
weighted minimum fill with the following lemma, whose proof follows with a
slight modification of the proof of Theorem 3.4 in [19], so we omit it.

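Lemma 5. Let G be a non-complete graph and $F_w$ be a fill weight on G. Then

$$\mathrm{mfi}_{F_w}(G) = \min_{S \in \Delta_G} \Big( \mathrm{fill}_{F_w}(S) + \sum_{C} \mathrm{mfi}_{F_w}(R(S, C)) \Big)$$

where the sum occurs over the connected components C of G − S and

$$\mathrm{fill}_{F_w}(S) = \sum_{f \in \mathrm{pf}(S)} F_w(f).$$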

It turns out that non-full blocks with respect to $S \in \Delta_G$ are full
blocks with respect to a different minimal separator of G. They also allow us
to compute $\mathrm{mfi}_{F_w}(R(S, C))$, which is a useful fact for later when
we restrict our attention to full blocks of G.

Lemma 6. Let G be a graph, $S \in \Delta_G$, and C be a connected component
of G − S. If $N(C) = S' \subset S$, then (S', C) is a full block of G (i.e.
$S' \in \Delta_G$). Further, if $E' \subseteq \mathrm{pf}(C)$, then the graph
obtained from R(S, C) by adding the fill edges in E' is a minimal triangulation
of R(S, C) if and only if the graph obtained from R(S', C) by adding the fill
edges in E' is a minimal triangulation of R(S', C).

This gives us the following, an extension of Corollary 4.5 in [3]. 

Corollary 2. Let G be a graph, $S \in \Delta_G$, and C be a connected component
of G − S. If $N(C) = S' \subset S$, then
$\mathrm{mfi}_{F_w}(R(S, C)) = \mathrm{mfi}_{F_w}(R(S', C))$ for any fill
weight $F_w$.

In order to compute $\mathrm{mfi}_{F_w}(R(S, C))$, we need the notion of a
potential maximal clique. Let G be a graph and K be a subset of its vertices.
Then K is a potential maximal clique of G if there is a minimal triangulation H
of G and K is a maximal clique of H. That is, every pair of vertices in K are
adjacent in H, and no proper superset of K has this property. The set of
potential maximal cliques of G is denoted by $\Pi_G$. The next two lemmas
describe the interplay between potential maximal cliques, minimal separators,
and blocks.

Lemma 7. [8] Let G be a graph and K be a potential maximal clique of G. Then
$S \in \Delta_G$ and $S \subset K$ if and only if N(C) = S for some connected
component C of G − K.

Therefore if $K \in \Pi_G$ and $C_1, C_2, \ldots, C_p$ are the connected
components of G − K, each $(S_i, C_i)$ where $N(C_i) = S_i$ is a full block of
G (i.e. $S_i \in \Delta_G$). These blocks are called the blocks associated to K.

Lemma 8. [8] Suppose G is a graph, $S \in \Delta_G$, and (S, C) is a full
block. Then H(S, C) is a minimal triangulation of R(S, C) if and only if

1. there is a potential maximal clique K of G such that
   $S \subset K \subseteq (S, C)$; and

2. letting $(S_i, C_i)$ for $1 \le i \le p$ be the blocks associated to K such
   that $S_i \cup C_i \subset S \cup C$, we have
   $E(H) = \bigcup_{i=1}^{p} E(H_i) \cup \mathrm{pf}(K)$ where $H_i$ is a
   minimal triangulation of $R(S_i, C_i)$ for each $1 \le i \le p$.

The following lemma is an extension of Corollary 4.8 in [8]. For completeness,
we provide a proof.



Lemma 9. Let (S, C) be a full block of G and $F_w$ be a fill weight on G. Then

$$\mathrm{mfi}_{F_w}(R(S, C)) = \min_{S \subset K \subseteq (S, C)} \Big( \mathrm{fill}_{F_w}(K) - \mathrm{fill}_{F_w}(S) + \sum_{(S_i, C_i)} \mathrm{mfi}_{F_w}(R(S_i, C_i)) \Big) \qquad (1)$$

where the minimum is taken over all $K \in \Pi_G$ such that
$S \subset K \subseteq (S, C)$, and $(S_i, C_i)$ are the blocks associated to K
in G such that $S_i \cup C_i \subset S \cup C$.

Proof. Let H(S, C) be a triangulation of R(S, C) such that
$\mathrm{mfi}_{F_w}(R(S, C)) = F_w(H(S, C))$. Without loss of generality, we
may assume H(S, C) is a minimal triangulation of R(S, C) because $F_w$ is
non-negative. By Lemma 8 there is a potential maximal clique K such that
$S \subset K \subseteq (S, C)$ with blocks $(S_i, C_i)$ associated to K such
that $S_i \cup C_i \subset S \cup C$ for $1 \le i \le p$. Further, the fill
edges of H(S, C) are disjointly obtained from the fill edges of $H_i$ for
$1 \le i \le p$ and $\mathrm{pf}(K) - \mathrm{pf}(S)$ (because S is already
saturated in R(S, C)). Therefore

$$\mathrm{mfi}_{F_w}(R(S, C)) = F_w(H(S, C)) = \mathrm{fill}_{F_w}(K) - \mathrm{fill}_{F_w}(S) + \sum_{i=1}^{p} F_w(H_i).$$

Now, for a given $1 \le k \le p$, suppose for the sake of contradiction that
$H_k$ is not a $F_w$-minimum fill of $R(S_k, C_k)$. Then there is a minimal
triangulation $H'_k$ of $R(S_k, C_k)$ such that $F_w(H'_k) < F_w(H_k)$.
Further, the graph H'(S, C) with vertex set (S, C) and edge set
$(E(H(S, C)) - E(H_k)) \cup E(H'_k)$ is a minimal triangulation of R(S, C) by
Lemma 8, and it has a weighted fill of
$F_w(H'(S, C)) = F_w(H(S, C)) - F_w(H_k) + F_w(H'_k) < F_w(H(S, C))$. This
contradicts the $F_w$-minimality of H(S, C), so it must be that
$F_w(H_k) = \mathrm{mfi}_{F_w}(R(S_k, C_k))$, and therefore

$$\mathrm{mfi}_{F_w}(R(S, C)) = \mathrm{fill}_{F_w}(K) - \mathrm{fill}_{F_w}(S) + \sum_{i=1}^{p} \mathrm{mfi}_{F_w}(R(S_i, C_i)).$$

Letting LHS and RHS denote the left-hand side and right-hand side of equation
(1) respectively, we have shown that LHS ≥ RHS.

Now suppose $K^* \in \Pi_G$ such that $S \subset K^* \subseteq (S, C)$ and

$$\mathrm{fill}_{F_w}(K^*) - \mathrm{fill}_{F_w}(S) + \sum_{i=1}^{p^*} \mathrm{mfi}_{F_w}(R(S_i^*, C_i^*)) = \mathrm{RHS}$$

where $(S_i^*, C_i^*)$ are the blocks associated to K* in R(S, C) for
$1 \le i \le p^*$. For $1 \le i \le p^*$, let $H_i^*$ be a minimal
triangulation of $R(S_i^*, C_i^*)$ such that
$F_w(H_i^*) = \mathrm{mfi}_{F_w}(R(S_i^*, C_i^*))$. By Lemma 8 there is a
minimal triangulation H*(S, C) of R(S, C) obtained by adding the fill edges
$\mathrm{pf}(K^*) - \mathrm{pf}(S)$ and $E(H_i^*) - E(R(S_i^*, C_i^*))$ for
$1 \le i \le p^*$, and hence
$F_w(H^*(S, C)) = \mathrm{fill}_{F_w}(K^*) - \mathrm{fill}_{F_w}(S) + \sum_{i=1}^{p^*} \mathrm{mfi}_{F_w}(R(S_i^*, C_i^*))$.
Now, $\mathrm{mfi}_{F_w}(R(S, C)) \le F_w(H^*(S, C))$ by definition, so
LHS ≤ RHS and therefore LHS = RHS. □

Theorem 11. Let C be a set of partial characters on X with at most r parts per
character, and $F_w$ be a fill weight on int(C). There is an
$O(|X||C|^2 + (r|C|)^4 |\Delta_{\mathrm{int}(C)}|^2)$ time algorithm that
computes $\mathrm{mfi}_{F_w}(\mathrm{int}(C))$ and $\Delta^{\min}_{F_w}$.

Algorithm 1

 1: Input: Partial characters C on X with at most r parts
 2: Output: $\mathrm{mfi}_{F_w}(\mathrm{int}(C))$ and $\Delta^{\min}_{F_w}$
 3: compute int(C)
 4: compute $\Delta_{\mathrm{int}(C)}$ and $\Pi_{\mathrm{int}(C)}$
    { Find the $F_w$-minimum fill value for each full block }
 5: compute all the full blocks (S, C) and sort them by the number of vertices
 6: for each full block (S, C) taken in increasing order do
 7:     $\mathrm{mfi}_{F_w}(R(S, C)) \leftarrow \mathrm{fill}_{F_w}(S \cup C)$ if (S, C) is inclusion-minimal
        and $\mathrm{mfi}_{F_w}(R(S, C)) \leftarrow \infty$ otherwise
 8:     for each potential maximal clique K s.t. $S \subset K \subseteq S \cup C$ do
 9:         compute the blocks $(S_i, C_i)$ associated to K s.t. $S_i \cup C_i \subset S \cup C$
10:         newfill $\leftarrow \mathrm{fill}_{F_w}(K) - \mathrm{fill}_{F_w}(S) + \sum_i \mathrm{mfi}_{F_w}(R(S_i, C_i))$
11:         $\mathrm{mfi}_{F_w}(R(S, C)) \leftarrow \min(\mathrm{mfi}_{F_w}(R(S, C)), \text{newfill})$
12:     end for
13: end for
14: $\mathrm{mfi}_{F_w}(\mathrm{int}(C)) \leftarrow \infty$
    { Find the $F_w$-minimum fill value for minimal triangulations containing S }
15: for each minimal separator S of int(C) do
16:     compute the blocks $(S_i, C_i)$ associated to S where $N(C_i) = S_i$
17:     $\mathrm{mfi}_{F_w}(S) \leftarrow \mathrm{fill}_{F_w}(S) + \sum_i \mathrm{mfi}_{F_w}(R(S_i, C_i))$
18:     $\mathrm{mfi}_{F_w}(\mathrm{int}(C)) \leftarrow \min(\mathrm{mfi}_{F_w}(\mathrm{int}(C)), \mathrm{mfi}_{F_w}(S))$
19: end for
20: $\Delta^{\min}_{F_w} \leftarrow \{S \in \Delta_{\mathrm{int}(C)}$ s.t. $\mathrm{mfi}_{F_w}(S) = \mathrm{mfi}_{F_w}(\mathrm{int}(C))\}$

Proof. Our approach is described in Algorithm 1. Constructing int(C) can be
done in $O((|X| + r^2)|C|^2)$ time as follows. There are at most r|C| vertices
of int(C), one per part of each character. Recall that a pair of vertices
$(A, \chi)$ and $(A', \chi')$ of int(C) form an edge if and only if there is
some $a \in A \cap A'$. For each $a \in X$, let C(a) be the vertices of int(C)
whose cell contains a. These sets are computed in O(|X||C|) amortized time by
scanning each cell of each character. The edges of int(C) are now found by
examining each pair $(A_1, \chi_1)$, $(A_2, \chi_2)$ in C(a) for all
$a \in X$. To address redundancy, order the characters and the cells of each
character, then construct a table to check if $(A_1, \chi_1)(A_2, \chi_2)$ has
already been found as an edge. Examining C(a) to find these edges takes
$O(|C|^2)$ time (because a is in at most one cell per character), and
constructing the redundancy table takes $O((r|C|)^2)$ time, for a total of
$O((|X| + r^2)|C|^2)$ time.
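
The construction of int(C) described in the proof can be sketched as follows. This is an illustrative sketch rather than the exact procedure analyzed above: it assumes each character is given as a list of cells, represents a vertex as a pair (cell, character index), and replaces the explicit redundancy table by Python sets, which also discard repeated edges. The function name is hypothetical.

```python
from itertools import combinations


def partition_intersection_graph(characters):
    """Build int(C) as a dict mapping each vertex to its set of neighbours.

    characters: list of characters, each a list of cells (iterables of taxa).
    """
    adj = {}
    by_taxon = {}  # C(a): the vertices whose cell contains taxon a
    for idx, chi in enumerate(characters):
        for cell in chi:
            v = (frozenset(cell), idx)
            adj[v] = set()
            for a in v[0]:
                by_taxon.setdefault(a, []).append(v)
    # Two vertices are adjacent exactly when some taxon lies in both cells.
    for verts in by_taxon.values():
        for u, v in combinations(verts, 2):
            adj[u].add(v)
            adj[v].add(u)
    return adj


# The three characters of Figure 1.
chars = [
    ["abcdef", "gh", "ij", "kl"],
    ["ag", "dj", "fl"],
    ["bh", "ci", "ek"],
]
graph = partition_intersection_graph(chars)
num_edges = sum(len(ns) for ns in graph.values()) // 2
print(len(graph), num_edges)
```

Because each taxon lies in at most one cell per character, every list C(a) contains at most one vertex per character, so no edge is ever created between two cells of the same character.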

For a general graph G with |V| vertices and |E| edges, it is possible to
compute $\Delta_G$ in $O(|V|^3 |\Delta_G|)$ time [2] and $\Pi_G$ in
$O(|V|^2 |E| |\Delta_G|^2)$ time [9]. Let n be the number of vertices of
int(C). The full block computation and nested for loop can be implemented in
$O(n^3 |\Pi_{\mathrm{int}(C)}|)$ time, which follows from the proof of Theorem
3.4 in [12]. It is known that
$|\Pi_G| \le |V||\Delta_G|^2 + |V||\Delta_G| + 1$ [9], so the nested for loop
takes $O(n^4 |\Delta_{\mathrm{int}(C)}|^2)$ time.

Consider the second for loop and let $S \in \Delta_{\mathrm{int}(C)}$. The
blocks associated to S are found in $O(n^2)$ time by searching the graph to
find the connected components, and then computing N(C) for each connected
component C of G − S (in the second computation, each edge of the graph is
examined at most once). By Lemma 6 each $(S_i, C_i)$ is a full block of G, so
we have calculated $\mathrm{mfi}_{F_w}(R(S_i, C_i))$ during the first for loop.
The calculation on line 17 matches the one in Lemma 5 because
$\mathrm{mfi}_{F_w}(R(S, C_i)) = \mathrm{mfi}_{F_w}(R(S_i, C_i))$ by Corollary
2. It takes $O(n^2)$ time to compute $\mathrm{fill}_{F_w}(S)$, so the second
for loop takes $O(n^2 |\Delta_{\mathrm{int}(C)}|)$ time. The last line of the
algorithm takes $O(|\Delta_{\mathrm{int}(C)}|)$ time. Aside from the
$O(|X||C|^2)$ term, the bottleneck of the algorithm is the first nested for
loop and calculating $\Pi_{\mathrm{int}(C)}$, so the entire algorithm runs in
$O(|X||C|^2 + (r|C|)^4 |\Delta_{\mathrm{int}(C)}|^2)$ time. □



Proof of Theorem 2. By Lemma 1 it suffices to compute the $I_C$-minimum fill
of int(C). This takes $O(|X||C|^2 + (r|C|)^4 |\Delta_{\mathrm{int}(C)}|^2)$
time by Theorem 11. □

Proof of Theorem 5. By Theorem 8 it suffices to compute the $F_w$-minimum (or
$I_C$-minimum) fill of int(C). Each character has only two states, so this
takes $O(|X||C|^2 + |C|^4 |\Delta_{\mathrm{int}(C)}|^2)$ time by Theorem 11. □

Proof of Theorem 7. By Theorem 10 it suffices to determine if
$\Delta^{\min}_{I_C}$ is a pairwise-parallel set of minimal separators.
Computing $\Delta^{\min}_{I_C}$ takes
$O(|X||C|^2 + (r|C|)^4 |\Delta_{\mathrm{int}(C)}|^2)$ time by Theorem 11. To
determine if S and $S' \in \Delta^{\min}_{I_C}$ are parallel, we compute the
connected components of G − S in linear time, and then count the number of
connected components that have a vertex from S'. S and S' are parallel if and
only if this count is one. This takes at most
$O((r|C|)^2 |\Delta^{\min}_{I_C}|^2)$ time, and
$\Delta^{\min}_{I_C} \subseteq \Delta_{\mathrm{int}(C)}$, giving a total of
$O(|X||C|^2 + (r|C|)^4 |\Delta_{\mathrm{int}(C)}|^2)$ time. □
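
The parallel test used in the proof of Theorem 7 can be sketched as below. This is an illustration under our own assumptions, with hypothetical function names and dictionary-of-sets graphs. One deliberate deviation is labeled in the code: the sketch also accepts a count of zero, which arises when S' is contained in S, whereas the text above states the test for a count of exactly one.

```python
from itertools import combinations


def components(adj, removed):
    """Connected components of G - removed, as a list of vertex sets."""
    seen, comps = set(removed), []
    for start in adj:
        if start in seen:
            continue
        comp, stack = {start}, [start]
        seen.add(start)
        while stack:
            x = stack.pop()
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    comp.add(y)
                    stack.append(y)
        comps.append(comp)
    return comps


def are_parallel(adj, S, T):
    """True if the vertices of T meet at most one component of G - S.

    Assumption of this sketch: a count of zero (T contained in S) is treated
    as parallel as well.
    """
    T = set(T)
    hit = sum(1 for comp in components(adj, S) if comp & T)
    return hit <= 1


def pairwise_parallel(adj, separators):
    """True if every pair of the given separators is parallel."""
    return all(are_parallel(adj, S, T) for S, T in combinations(separators, 2))


c4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
print(are_parallel(c4, {0, 2}, {1, 3}), pairwise_parallel(path, [{1}, {2}, {3}]))
```

On the four-cycle the two minimal separators cross, which matches the fact that its two minimal triangulations each saturate exactly one of them; on the path all three singleton separators are pairwise parallel.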



5 Discussion 



An immediate question is whether or not Theorem 5 can be extended to the
case where r is unbounded. This does not seem possible to do for the following
reason. Let S be a minimal separator that is a clique in a minimal
triangulation of int(C) that is an optimal solution to the maximum
compatibility problem, and $C_1, C_2, \ldots, C_k$ be the connected components
of int(C) − S. Then the minimal triangulations of $R(S, C_i)$ for
$1 \le i \le k$ are dependent with respect to a fill weight $F_w$, unlike the
two-state case, as illustrated in Figure 2. It is possible to construct
similar examples for unweighted characters. Hence any sort of separator-based
approach for a given optimization function seems to require a decomposition
property similar to that of Lemma 5.